Relation Extraction for Semantic Intranet Annotations

Authors

  • Lucia Specia
  • Claudio Baldassarre
  • Enrico Motta
Abstract

We present an approach for ontology-driven extraction of relations from texts, aimed mainly at producing enriched semantic annotations for the Semantic Web. The approach combines linguistic and empirical strategies in a pipeline that involves a parser, a part-of-speech tagger, a named entity recognition system, and pattern-based classification, together with resources including an ontology, a knowledge base, and lexical databases. A preliminary evaluation with 25 sentences showed that using knowledge-intensive resources and strategies together with corpus-based techniques to process the input data makes it possible to identify and discover relevant relations between known and new entity pairs mentioned in the text. Besides Semantic Web annotation, the system can be used for other tasks, including ontology population, since it identifies new instantiations of existing relations and entities, and ontology learning, since it discovers new relations that are not part of the ontology.
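The general idea of the pipeline can be illustrated with a minimal sketch. This is not the authors' system: the gazetteer lookup stands in for named entity recognition, a substring match stands in for parsing and pattern-based classification, and every name in it (ONTOLOGY_RELATIONS, GAZETTEER, VERB_PATTERNS, the example entities) is a hypothetical placeholder. It only shows how an extracted relation can be marked as an instance of a relation already in the ontology (ontology population) or as a new one (ontology learning).

```python
import re
from typing import NamedTuple

# Hypothetical toy resources; the paper's real ontology and lexicons are not shown here.
ONTOLOGY_RELATIONS = {("Person", "works_for", "Organization")}
GAZETTEER = {
    "Alice Smith": "Person",
    "Acme Corp": "Organization",
    "Acme Labs": "Organization",
}
# Surface pattern between two entities -> relation name (assumed mapping).
VERB_PATTERNS = {"works for": "works_for", "directs": "directs"}


class Relation(NamedTuple):
    subject: str
    predicate: str
    obj: str
    known: bool  # True if the typed relation already exists in the ontology


def find_entities(sentence):
    """Gazetteer lookup standing in for a named entity recognizer."""
    found = []
    for name, etype in GAZETTEER.items():
        for m in re.finditer(re.escape(name), sentence):
            found.append((m.start(), m.end(), name, etype))
    return sorted(found)  # order entities by position in the sentence


def extract_relations(sentence):
    """Match the text between adjacent entity pairs against the patterns."""
    entities = find_entities(sentence)
    relations = []
    for (_, e1, n1, t1), (s2, _, n2, t2) in zip(entities, entities[1:]):
        between = sentence[e1:s2].strip().lower()
        for pattern, predicate in VERB_PATTERNS.items():
            if pattern in between:
                known = (t1, predicate, t2) in ONTOLOGY_RELATIONS
                relations.append(Relation(n1, predicate, n2, known))
    return relations


if __name__ == "__main__":
    for text in ("Alice Smith works for Acme Corp.", "Alice Smith directs Acme Labs."):
        print(extract_relations(text))
```

In this sketch the first sentence yields a relation flagged as known, a candidate for ontology population, while the second yields a relation that is absent from the toy ontology, a candidate for ontology learning.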

Similar resources

The Light-Weight Semantic Web: Integrating Information Extraction and Information Retrieval for Heterogeneous Environments

Today’s Web, large intranets and even the documents collected by a single user are enormous sources of distributed, heterogeneous information that cannot be easily mastered. Syntactical and semantical differences as well as missing semantic annotations make effective query evaluation on such corpora a hard task. The Semantic Web aims at providing a standard for semantic annotations, but has not...

Towards a Wiki Interchange Format (WIF) Opening Semantic Wiki Content and Metadata

Wikis are increasingly being used in world-wide, intranet and even in personal settings. Unfortunately, current wikis are data islands: people can read and edit them, but machines can only send around text strings without structure. Wiki migration, publishing from one wiki to another one and free choice of syntax hold back broader wiki usage. We define a wiki interchange format (WIF) that allow...

A hybrid approach for relation extraction aimed to semantic annotations

We present an approach for relation extraction from texts aimed at enriching the semantic annotations produced by a semantic web portal. The approach exploits linguistic and empirical strategies, by means of a pipeline method involving processes such as a parser, part-of-speech tagger, named entity recognition system, pattern-based classification and word sense disambiguation models, and resources...

Semantic Web Technologies for Analysis of Transcriptome

The Acacia team studies knowledge management through the building of an organizational memory, which we propose to materialize through an “organizational semantic web” constituted of: • resources: they can be documents (in various formats such as XML, HTML, or even classic formats), but these resources can also correspond to people, services, software or programs, • ont...

Annotating Relation Mentions in Tabloid Press

This paper presents a new resource for the training and evaluation needed by relation extraction experiments. The corpus consists of annotations of mentions for three semantic relations: marriage, parent–child, siblings, selected from the domain of biographic facts about persons and their social relationships. The corpus contains more than one hundred news articles from Tabloid Press. In the cu...

Publication date: 2006